OIPE 



ENTERED 



RAW SEQUENCE LISTING DATE: 11/19/2001 

PATENT APPLICATION: US/09/965,807 TIME: 14:24:06 

Input Set : N:\Crf3\RULE60\09965807.txt 
Output Set: N:\CRF3\11192001\I965807.raw 

SEQUENCE LISTING 

4 (1) GENERAL INFORMATION: 

6 (i) APPLICANT: Matalon, Reuben 

7 Kaul, Rajinder 

8 Gao, Guang Ping 

9 Balamurugan, Kuppareddi 

10 Michals-Matalon, Kimberlee 

12 (ii) TITLE OF INVENTION: Aspartoacylase Gene, Protein, and 

13 Methods of Screening for Mutations Associated with Canavan 

14 Disease 
16 (iii) NUMBER OF SEQUENCES: 27 

18 (iv) CORRESPONDENCE ADDRESS: 

19 (A) ADDRESSEE: Millen, White, Zelano & Branigan, P.C. 

20 (B) STREET: 2200 Clarendon Boulevard, Suite 1400 

21 (C) CITY: Arlington 

22 (D) STATE: Virginia 

23 (E) COUNTRY: U.S.A. 

24 (F) ZIP: 22201 

26 (v) COMPUTER READABLE FORM: 

27 (A) MEDIUM TYPE: Floppy disk 

28 (B) COMPUTER: IBM PC compatible 
29 (C) OPERATING SYSTEM: PC-DOS/MS-DOS 
30 (D) SOFTWARE: PatentIn Release #1.0, Version #1.25 
32 (vi) CURRENT APPLICATION DATA: 

C--> 33 (A) APPLICATION NUMBER: US/09/965,807 

C--> 34 (B) FILING DATE: 01-Oct-2001 

40 (C) CLASSIFICATION: 

37 (vii) PRIOR APPLICATION DATA: 

38 (A) APPLICATION NUMBER: US 08/128,020 

39 (B) FILING DATE: 29-SEP-1993 

42 (viii) ATTORNEY/AGENT INFORMATION: 

43 (A) NAME: Hamlet-King, Diana 

44 (B) REGISTRATION NUMBER: 33,302 

45 (C) REFERENCE/DOCKET NUMBER: Shutt 1 

47 (ix) TELECOMMUNICATION INFORMATION: 

48 (A) TELEPHONE: 703-243-6333 

49 (B) TELEFAX: 703-243-6410 

50 (C) TELEX: 64191 
53 (2) INFORMATION FOR SEQ ID NO: 1: 

55 (i) SEQUENCE CHARACTERISTICS: 

56 (A) LENGTH: 1435 base pairs 

57 (B) TYPE: nucleic acid 

58 (C) STRANDEDNESS: double 

59 (D) TOPOLOGY: linear 

62 (ix) FEATURE: 

63 (A) NAME/KEY: CDS 

64 (B) LOCATION: 159..1097 
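As a quick consistency check on the feature entry above, the CDS span 159..1097 can be compared with the 313 amino acids stated for SEQ ID NO: 2 later in this listing. The Python sketch below is illustrative only: the helper name `cds_codon_count` is hypothetical and is not part of PatentIn or of the listing itself.

```python
def cds_codon_count(start: int, end: int) -> int:
    """Count whole codons in a 1-based, inclusive CDS span (hypothetical helper)."""
    length = end - start + 1  # inclusive span length in bases
    if length % 3 != 0:
        raise ValueError(f"CDS length {length} is not a multiple of 3")
    return length // 3


# Values taken from the listing: CDS LOCATION 159..1097 in SEQ ID NO: 1.
codons = cds_codon_count(159, 1097)
print(codons)  # 313, the residue count stated for SEQ ID NO: 2
```

The span covers 939 bases, so the coding region accounts for exactly 313 codons with no stop codon included in the annotated range.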
70 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
72 TTGTAACAGA AAATTAAAAT ATACTCCACT CAAGGGAATT CTGTACTTTG CCCTTTTGGT 60 
74 AAAGTCTCAT TTACATTTCT AAACCTTTCT TAAGAAAATC GAATTTCCTT TGATCTCTCT 120 
76 TCTGAATTGC AGAAATCAGA TAAAAACTAC TTGGTGAA ATG ACT TCT TGT CAC 173 
77 Met Thr Ser Cys His 
78 1 5 
80 ATT GCT GAA GAA CAT ATA CAA AAG GTT GCT ATC TTT GGA GGA ACC CAT 221 
81 Ile Ala Glu Glu His Ile Gln Lys Val Ala Ile Phe Gly Gly Thr His 
82 10 15 20 
84 GGG AAT GAG CTA ACC GGA GTA TTT CTG GTT AAG CAT TGG CTA GAG AAT 269 
85 Gly Asn Glu Leu Thr Gly Val Phe Leu Val Lys His Trp Leu Glu Asn 
86 25 30 35 
88 GGC GCT GAG ATT CAG AGA ACA GGG CTG GAG GTA AAA CCA TTT ATT ACT 317 
89 Gly Ala Glu Ile Gln Arg Thr Gly Leu Glu Val Lys Pro Phe Ile Thr 
90 40 45 50 
92 AAC CCC AGA GCA GTG AAG AAG TGT ACC AGA TAT ATT GAC TGT GAC CTG 365 
93 Asn Pro Arg Ala Val Lys Lys Cys Thr Arg Tyr Ile Asp Cys Asp Leu 
94 55 60 65 
96 AAT CGC ATT TTT GAC CTT GAA AAT CTT GGC AAA AAA ATG TCA GAA GAT 413 
97 Asn Arg Ile Phe Asp Leu Glu Asn Leu Gly Lys Lys Met Ser Glu Asp 
98 70 75 80 85 
100 TTG CCA TAT GAA GTG AGA AGG GCT CAA GAA ATA AAT CAT TTA TTT GGT 461 
101 Leu Pro Tyr Glu Val Arg Arg Ala Gln Glu Ile Asn His Leu Phe Gly 
102 90 95 100 
104 CCA AAA GAC AGT GAA GAT TCC TAT GAC ATT ATT TTT GAC CTT CAC AAC 509 
105 Pro Lys Asp Ser Glu Asp Ser Tyr Asp Ile Ile Phe Asp Leu His Asn 
106 105 110 115 
108 ACC ACC TCT AAC ATG GGG TGC ACT CTT ATT CTT GAG GAT TCC AGG AAT 557 
109 Thr Thr Ser Asn Met Gly Cys Thr Leu Ile Leu Glu Asp Ser Arg Asn 
110 120 125 130 
112 AAC TTT TTA ATT CAG ATG TTT CAT TAC ATT AAG ACT TCT CTG GCT CCA 605 
113 Asn Phe Leu Ile Gln Met Phe His Tyr Ile Lys Thr Ser Leu Ala Pro 
114 135 140 145 
116 CTA CCC TGC TAC GTT TAT CTG ATT GAG CAT CCT TCC CTC AAA TAT GCG 653 
117 Leu Pro Cys Tyr Val Tyr Leu Ile Glu His Pro Ser Leu Lys Tyr Ala 
118 150 155 160 165 
120 ACC ACT CGT TCC ATA GCC AAG TAT CCT GTG GGT ATA GAA GTT GGT CCT 701 
121 Thr Thr Arg Ser Ile Ala Lys Tyr Pro Val Gly Ile Glu Val Gly Pro 
122 170 175 180 
124 CAG CCT CAA GGG GTT CTG AGA GCT GAT ATC TTG GAT CAA ATG AGA AAA 749 
125 Gln Pro Gln Gly Val Leu Arg Ala Asp Ile Leu Asp Gln Met Arg Lys 
126 185 190 195 
128 ATG ATT AAA CAT GCT CTT GAT TTT ATA CAT CAT TTC AAT GAA GGA AAA 797 
129 Met Ile Lys His Ala Leu Asp Phe Ile His His Phe Asn Glu Gly Lys 
130 200 205 210 
134 GAA TTT CCT CCC TGC GCC ATT GAG GTC TAT AAA ATT ATA GAG AAA GTT 845 
135 Glu Phe Pro Pro Cys Ala Ile Glu Val Tyr Lys Ile Ile Glu Lys Val 
136 215 220 225 
138 GAT TAC CCC CGG GAT GAA AAT GGA GAA ATT GCT GCT ATC ATC CAT CCT 893 
139 Asp Tyr Pro Arg Asp Glu Asn Gly Glu Ile Ala Ala Ile Ile His Pro 
140 230 235 240 
142 AAT CTG CAG GAT CAA GAC TGG AAA CCA CTG CAT CCT GGG GAT [illegible] ATG 941 
143 Asn Leu Gln Asp Gln Asp Trp Lys Pro Leu His Pro Gly Asp Pro Met 
144 250 255 260 
146 TTT TTA ACT CTT GAT GGG AAG ACG ATC CCA CTG GGC GGA GAC TGT ACC 989 
147 Phe Leu Thr Leu Asp Gly Lys Thr Ile Pro Leu Gly Gly Asp Cys Thr 
148 265 270 275 
150 GTG TAC CCC GTG TTT GTG AAT GAG GCC GCA TAT TAC GAA AAG AAA GAA 1037 
151 Val Tyr Pro Val Phe Val Asn Glu Ala Ala Tyr Tyr Glu Lys Lys Glu 
152 280 285 290 
154 GCT TTT GCA AAG ACA ACT AAA CTA ACG CTC AAT GCA AAA AGT ATT CGC 1085 
155 Ala Phe Ala Lys Thr Thr Lys Leu Thr Leu Asn Ala Lys Ser Ile Arg 
156 295 300 305 
158 TGC TGT TTA CAT TAGAAATCAC TTCCAGCTTA CATCTTACAC GGTGTCTTAC 1137 
159 Cys Cys Leu His 
160 310 
162 AAATTCTGCT AGTCTGTAAG CTCCTTAAGA GTAGGGTTGT GCCTTATTCA ACTGCATACA 1197 
164 TAGCTCCTAG CACAGTGCCT TATTCGGTAG GCATCTAAGC AAATTTCTTA AATTAATTAA 1257 
166 TATATCTTTA AAGATATCAT ATTTTATGTA TGTAGCTTAT TCAAAGAAGT GTTTCCTATT 1317 
168 TCTATATAGT TTATTATACA TGATACTTGG GTAGCTCAAC ATTCTTAATA AACAGCCTTT 1377 
170 GTATTCAGAA TATAAAATTG AAATAGATAT ATATAAAGTT AAAAAAAAAA AAAAAAAA 1435 
173 (2) INFORMATION FOR SEQ ID NO: 2: 
175 (i) SEQUENCE CHARACTERISTICS: 
176 (A) LENGTH: 313 amino acids 
177 (B) TYPE: amino acid 
178 (D) TOPOLOGY: linear 
181 (ix) FEATURE: 
182 (A) NAME/KEY: Modified-site 
183 (B) LOCATION: 83 
184 (D) OTHER INFORMATION: /note= "Phosphorylation site" 
186 (ix) FEATURE: 
187 (A) NAME/KEY: Modified-site 
188 (B) LOCATION: 105 
189 (D) OTHER INFORMATION: /note= "Phosphorylation site" 
191 (ix) FEATURE: 
192 (A) NAME/KEY: Modified-site 
193 (B) LOCATION: 108 
194 (D) OTHER INFORMATION: /note= "Phosphorylation site" 
198 (ix) FEATURE: 
199 (A) NAME/KEY: Modified-site 
200 (B) LOCATION: 146 
201 (D) OTHER INFORMATION: /note= "Phosphorylation site" 
203 (ix) FEATURE: 
204 (A) NAME/KEY: Modified-site 
205 (B) LOCATION: 264 
206 (D) OTHER INFORMATION: /note= "Phosphorylation site" 
208 (ix) FEATURE: 
209 (A) NAME/KEY: Modified-site 
210 (B) LOCATION: 117 

211 (D) OTHER INFORMATION: /note= "Potential N-glycosylation 

212 site" 

214 (ix) FEATURE: 

215 (A) NAME/KEY: Active-site 

216 (B) LOCATION: 18..24 

217 (D) OTHER INFORMATION: /note= "Consensus sequence 

218 predicted to be involved in catalysis" 

220 (ix) FEATURE: 

221 (A) NAME/KEY: Active-site 

222 (B) LOCATION: 275..278 

223 (D) OTHER INFORMATION: /note= "Consensus sequence 

224 predicted to be involved in catalysis" 

226 (ix) FEATURE: 

227 (A) NAME/KEY: Active-site 

228 (B) LOCATION: 283..289 

229 (D) OTHER INFORMATION: /note= "Consensus sequence 

230 predicted to be involved in catalysis" 



233 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
235 Met Thr Ser Cys His Ile Ala Glu Glu His Ile Gln Lys Val Ala Ile 
236 1 5 10 15 
238 Phe Gly Gly Thr His Gly Asn Glu Leu Thr Gly Val Phe Leu Val Lys 
239 20 25 30 
241 His Trp Leu Glu Asn Gly Ala Glu Ile Gln Arg Thr Gly Leu Glu Val 
242 35 40 45 
244 Lys Pro Phe Ile Thr Asn Pro Arg Ala Val Lys Lys Cys Thr Arg Tyr 
245 50 55 60 
247 Ile Asp Cys Asp Leu Asn Arg Ile Phe Asp Leu Glu Asn Leu Gly Lys 
248 65 70 75 80 
250 Lys Met Ser Glu Asp Leu Pro Tyr Glu Val Arg Arg Ala Gln Glu Ile 
251 85 90 95 
253 Asn His Leu Phe Gly Pro Lys Asp Ser Glu Asp Ser Tyr Asp Ile Ile 
254 100 105 110 
256 Phe Asp Leu His Asn Thr Thr Ser Asn Met Gly Cys Thr Leu Ile Leu 
257 115 120 125 
259 Glu Asp Ser Arg Asn Asn Phe Leu Ile Gln Met Phe His Tyr Ile Lys 
260 130 135 140 
262 Thr Ser Leu Ala Pro Leu Pro Cys Tyr Val Tyr Leu Ile Glu His Pro 
263 145 150 155 160 
265 Ser Leu Lys Tyr Ala Thr Thr Arg Ser Ile Ala Lys Tyr Pro Val Gly 
266 165 170 175 
268 Ile Glu Val Gly Pro Gln Pro Gln Gly Val Leu Arg Ala Asp Ile Leu 
269 180 185 190 
271 Asp Gln Met Arg Lys Met Ile Lys His Ala Leu Asp Phe Ile His His 
272 195 200 205 
274 Phe Asn Glu Gly Lys Glu Phe Pro Pro Cys Ala Ile Glu Val Tyr Lys 
275 210 215 220 
277 Ile Ile Glu Lys Val Asp Tyr Pro Arg Asp Glu Asn Gly Glu Ile Ala 
278 225 230 235 240 
280 Ala Ile Ile His Pro Asn Leu Gln Asp Gln Asp Trp Lys Pro Leu His 

281 245 250 255 

283 Pro Gly Asp Pro Met Phe Leu Thr Leu Asp Gly Lys Thr Ile Pro Leu 

284 260 265 270 

286 Gly Gly Asp Cys Thr Val Tyr Pro Val Phe Val Asn Glu Ala Ala Tyr 

287 275 280 285 

289 Tyr Glu Lys Lys Glu Ala Phe Ala Lys Thr Thr Lys Leu Thr Leu Asn 

290 290 295 300 

292 Ala Lys Ser Ile Arg Cys Cys Leu His 

293 305 310 

295 (2) INFORMATION FOR SEQ ID NO: 3: 

297 (i) SEQUENCE CHARACTERISTICS: 

298 (A) LENGTH: 313 amino acids 

299 (B) TYPE: amino acid 

300 (D) TOPOLOGY: linear 

303 (ix) FEATURE: 

304 (A) NAME/KEY: Region 

305 (B) LOCATION: 6 

306 (D) OTHER INFORMATION: /note= "This is isoleucine in 

307 human, valine in bovine. This is a very 

308 conservative substitution." 

310 (ix) FEATURE: 

311 (A) NAME/KEY: Region 

312 (B) LOCATION: 9 

313 (D) OTHER INFORMATION: /note= "This is glutamic acid in 

314 human, aspartic acid in bovine. This is a very 

315 conservative substitution." 

317 (ix) FEATURE: 

318 (A) NAME/KEY: Region 

319 (B) LOCATION: 10 

320 (D) OTHER INFORMATION: /note= "This is histidine in human, 

321 proline in bovine. This is a conservative 

322 substitution." 

326 (ix) FEATURE: 

327 (A) NAME/KEY: Region 
328 (B) LOCATION: 12 

329 (D) OTHER INFORMATION: /note= "This is glutamine in human, 

330 lysine in bovine. This is a very conservative 

331 substitution." 

333 (ix) FEATURE: 

334 (A) NAME/KEY: Region 

335 (B) LOCATION: 38 

336 (D) OTHER INFORMATION: /note= "This is glycine in human, 

337 serine in bovine. This is a very conservative 

338 substitution." 

340 (ix) FEATURE: 

341 (A) NAME/KEY: Region 

342 (B) LOCATION: 39 

343 (D) OTHER INFORMATION: /note= "This is alanine in human, 
L:33 M:220 C: Keyword misspelled or invalid format, [(A) APPLICATION NUMBER:] 
L:527 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:3 
L:533 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:3 
L:539 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:3 
L:542 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:3 
L:[illegible] M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:3 
L:563 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:3 
L:569 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:3 
L:572 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:3 
L:576 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:3 
L:579 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:3 
L:585 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:3 
L:845 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:12 
L:884 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:14 
L:887 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:14 
L:911 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:15 
L:932 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:16 
L:1162 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:25 
L:1190 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:26 
L:1212 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:27 
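The validation messages above share a regular shape: a listing line, a message number, a one-letter flag, and free text. A minimal Python sketch for splitting such lines into fields and tallying them is given below. The field names (`line`, `code`, `flag`, `text`) and the helper names are assumptions made for illustration; the listing itself does not define them.

```python
import re
from collections import Counter

# Pattern inferred from the messages in this report, e.g.
#   L:1162 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:25
MESSAGE = re.compile(r"^L:(\d+)\s+M:(\d+)\s+([A-Z]):\s*(.*)$")


def parse_message(raw: str):
    """Split one report line into fields, or return None if it does not match."""
    match = MESSAGE.match(raw.strip())
    if match is None:
        return None
    return {
        "line": int(match.group(1)),   # listing line the message refers to
        "code": int(match.group(2)),   # message number
        "flag": match.group(3),        # one-letter flag such as W or C
        "text": match.group(4),
    }


def tally_by_flag(lines):
    """Count parsed messages per flag letter, skipping lines that do not match."""
    parsed = (parse_message(raw) for raw in lines)
    return Counter(rec["flag"] for rec in parsed if rec is not None)


sample = [
    "L:33 M:220 C: Keyword misspelled or invalid format, [(A) APPLICATION NUMBER:]",
    'L:1162 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:25',
    'L:1190 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:26',
]
print(tally_by_flag(sample))
```

Lines that do not fit the pattern, such as an entry whose line number is unreadable, are skipped rather than guessed at.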